Knowledge discovery from student data can be very useful in predicting employment under
different job categories, and machine learning helps considerably in this regard. In this paper,
a hybrid machine learning model is proposed to predict the job categories that students may
obtain in their campus placement. The students considered are from undergraduate engineering
courses that follow a semester scheme. The mapping to jobs is predicted from their marks in the
previous seven semesters together with their personality index. The proposed hybrid model
consists of three different models based on a multilayer feed-forward architecture, a radial
basis function neural network, and a K-means based clustering method. The proposed model
provides the relative chances of a student obtaining each category of job.
1. INTRODUCTION

Predictive analytics based on machine learning can play an important role in improving student
employability outcomes. Policymakers, policy promoters, parents, and society at large currently
view higher education as the most important route to an entry-level job [1]. As a result,
educational institutions are increasingly held accountable for employability outcomes, which
makes it important for career centers to find practical and accessible methods for managing
student employability. The ability to anticipate which students will and will not be hired, and
which category of job an individual is likely to obtain, can help recruiters as well as the
students themselves. Employability predictions can inform career advisors about how internships,
extracurricular activities, other activities, and particular subjects lead to a job offer, and
how these factors work together. Prediction of student employability is already taking place
today; however, as data analytics becomes the new norm, the game is changing. One rapid
development is machine learning: a branch of artificial intelligence that draws insight from
data through pattern recognition in order to predict future outcomes. Machine learning can
analyze large, complex datasets and deliver accurate results faster than a human can. The first
step in machine learning is to determine what you want to learn from your data. Next, the data
are collected and cleaned so that they can be analyzed. The prepared data are then fed to a
machine learning model. Once the model has identified patterns, fresh data can be given to it to
forecast an outcome that is presently unknown. The predictive capability of machine learning has
become part of our day-to-day lives: the technology recommends online advertisements, controls
driverless cars, detects banking fraud, and predicts medical diagnoses. Machine learning is
likely to become commonplace in higher education, but the question is: “Can we accurately
predict which students will be employed at graduation?” A resume audit study has only a limited
capability to predict student employability before graduation; mainly, students with higher
academic performance and good skills are regarded as more employable. Machine learning provides
an alternative and more powerful method of analysis than static resume audits. Predicting
outcomes from career center data is still in its early stages. The research is complex and messy
because of the large quantity of data that requires analysis and the novelty of applying higher
education data to such models. However, there are large opportunities to predict outcomes in
employment and beyond, including factors such as time to graduation, student satisfaction, and
retention. The potential is considerable, and adopting machine learning within institutional
career centers is a natural starting point.

Journal homepage: http://ijece.iaescore.com (ISSN: 2088-8708)

In this paper, students pursuing undergraduate courses in different branches of an engineering
college are considered for employability in three different categories of jobs. The percentage
marks of the past seven semesters and a parameter capturing the personality point are taken as
the inputs to the predictive models. To make the prediction more robust, rather than relying on
a single predictive model, two different neural network models are considered: a multilayer
perceptron feed-forward architecture and a radial basis function neural network. A K-means based
clustering approach is also included, so that part of the solution does not depend on a learned
mapping approximation.

Related work on the use of machine learning for employment prediction is discussed in the
remainder of this section. Section 2 describes the proposed models, section 3 presents the
research method, and section 4 reports the experimental results and analysis. The conclusion is
presented in section 5.

The academic motivation given to students plays a very important role in their academic
performance. For any educational institute, it is important to identify students with weak
academic performance as early as possible and to motivate them to improve. A classification
model for predicting student academic motivation from performance in learning management system
(LMS) courses aims to establish a link between the predicted academic motivation of students and
their performance in the LMS course [1]. Cai [2] offered suggestions to global higher education
boards on how to motivate and produce graduates who meet the requirements and expectations of
employers.

Schumacher et al. [3] conducted a study designed to forecast which students successfully
complete the actuarial program in which they enrolled, using logistic regression to estimate the
probability of dropping out or of completing the program. The relation of social hierarchy to an
individual’s career choice self-efficacy is examined in [4] using a measure of social hierarchy.
Determining the factors behind a job switch and estimating when an individual will switch jobs
are essential for understanding individual careers. It remains very difficult, however, to
predict the job switching time, because the increasing division of the workforce and
globalization have made careers more dynamic.

In [5], the association between daily and professional life is presented, based on the work
experience of workers and their check-in records. An effort is made to reveal the extent to which
a job switch can be predicted from the day-to-day activities of workers and their job mobility.
Paparrizos and Cambazoglu [6] presented a query recommender system based on supervised machine
learning techniques, which recommends suitable new jobs to job seekers. Wang et al. [7] also
suggested a query recommender model that helps users find the data they require. With the wide
use of information technology in ambient assisted living systems, the data management model uses
the 3 V features of big data (volume, velocity and variety) to find a comprehensive data
generalization and to construct more universal systems.

Mao et al. [8] proposed a method of big data abstraction, with metric space as a universal
abstraction for ambient assisted living data types. Cognitive modelling can uncover the hidden
characteristics of examinees in order to forecast their scores on each problem. It plays a
significant role in various applications, for example personalized remedy recommendation, and
some solutions have been designed in the literature. Still, extracting information from both
subjective and objective problems to obtain an accurate and interpretable cognitive analysis is
underexplored. Wu et al. [9] proposed a fuzzy cognitive diagnosis framework (fuzzy CDF) for
examinee cognitive modelling using both subjective and objective problems.

Thakar and Mehta [10] presented an experimental study that compares various classification
algorithms on two datasets of Master of Computer Applications students collected from different
affiliated colleges of state universities in India. Shahiri et al. [11], [12] provided a summary
of the data mining techniques used for predicting student performance, along with information on
how the prediction algorithms can be used to find the most significant features in student data.
Nie et al. [13] proposed a data-driven framework for predicting the job choice of students upon
graduation based on their cognitive skills and the soft skills they exhibit on campus, thereby
playing a vital role in assisting career guidance and counselling. Al-Sudani and Palaniappan
[14] used a mixture of institutional, emotional, educational, demographic and commercial
features to predict the performance of students with a multilayer neural network, which
classified a student’s degree as either a basic or a good degree. In [15], external features
that affect student performance are studied and a classification model is developed using the
XGBoost classifier. Conventional statistical evaluations are limited in giving good predictions
of the educational quality of institutions. Lau et al. [16] presented a method that combines
conventional statistical analysis with neural network modelling for the prediction of student
performance. The works in [17], [18] tried to forecast the placements and results of programming
competitions through machine learning without complicated evaluation. The descriptive variables
used for machine learning are classified into different categories.


2. PROPOSED WORK 

Supervised learning in a neural network is often viewed as a curve-fitting process. The network
is presented with training pairs, each consisting of a vector from the input space together with
the desired network response. Through a defined learning algorithm, the network adjusts its
weights so that the error between the actual and the desired response is minimized with respect
to some optimization criterion. Once trained, the network performs interpolation in the output
vector space. A nonlinear mapping between the input and the output vector spaces can be achieved
with a multi-layer perceptron neural network (MLPNN) and with a radial basis function neural
network [19], [20]. Both networks are considered universal approximators.


2.1. Multi-layer perceptron neural network (MLPNN) 

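In back propagation, the size of each weight update is proportional to the derivative of the
nonlinear activation function. In an MLPNN the neuron activation function is of sigmoid type,
whose derivative is bell-shaped. Specifically, the weights must be chosen so as to minimize the
performance criterion

$$E_q = \frac{1}{2}\left(d_q - x_{out}^{(s)}\right)^T \left(d_q - x_{out}^{(s)}\right) = \frac{1}{2}\sum_{h=1}^{n_s}\left(d_{qh} - x_{out,h}^{(s)}\right)^2 \qquad (1)$$

where s is the number of layers in the network, $n_s$ is the number of neurons in the output
layer, and $d_q \in \mathbb{R}^{n_s \times 1}$ and $x_{out}^{(s)}$ are the desired and actual outputs,
respectively, of the network for the q-th training pattern. With an activation function g of
sigmoid type, the learning algorithm for an MLP neural network with three layers can be stated as
follows, where l denotes a layer index, $u_i^{(l)}$ the net input of neuron i in layer l, and
$x_{out,j}^{(l)}$ the output of neuron j in layer l.

a. Initialize the weights of the network according to a standard initialization procedure.

b. Present an input pattern from the set of training data and compute the network response.

c. Compare the desired output with the actual output of the network and calculate the local
error according to

For output layer: $\delta_i^{(s)} = \left(d_{qi} - x_{out,i}^{(s)}\right) g'\left(u_i^{(s)}\right) \qquad (2)$
For hidden layer: $\delta_i^{(l)} = \left(\sum_{h=1}^{n_{l+1}} \delta_h^{(l+1)} w_{hi}^{(l+1)}\right) g'\left(u_i^{(l)}\right) \qquad (3)$

d. Update the weights of the network as

$$w_{ij}^{(l)}(t+1) = w_{ij}^{(l)}(t) + \mu\, \delta_i^{(l)}\, x_{out,j}^{(l-1)} + \alpha\left[w_{ij}^{(l)}(t) - w_{ij}^{(l)}(t-1)\right] \qquad (4)$$

where $\mu$ is the learning rate and $\alpha$ is the momentum constant.

e. If the network has converged, stop the iteration; otherwise go back to step b.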
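
Steps a to e can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the MATLAB implementation used in this work: it assumes a single hidden layer, a bias input appended to every layer, uniform random initial weights in [-0.5, 0.5], pattern-by-pattern updates and a fixed number of epochs in place of a convergence test. The function name `train_mlp` and its arguments are hypothetical.

```python
import math
import random


def sigmoid(u):
    """Sigmoid activation g(u); its derivative is g(u) * (1 - g(u))."""
    return 1.0 / (1.0 + math.exp(-u))


def train_mlp(patterns, targets, n_hidden, mu=0.9, alpha=0.2, epochs=500, seed=0):
    """Train a one-hidden-layer MLP by back propagation with momentum.

    mu is the learning rate and alpha the momentum constant of the weight update.
    Returns (w_h, w_o, mse_history); the last entry of each weight row is a bias.
    """
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0]), len(targets[0])
    # Step a: small random initial weights (an assumed initialization scheme).
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    # Previous weight changes, i.e. w(t) - w(t-1), needed for the momentum term.
    prev_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    prev_o = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, d in zip(patterns, targets):
            # Step b: forward pass through hidden and output layers.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_o]
            # Step c: local errors of the output layer, then of the hidden layer.
            delta_o = [(d[i] - y[i]) * y[i] * (1.0 - y[i]) for i in range(n_out)]
            delta_h = [
                sum(delta_o[i] * w_o[i][j] for i in range(n_out)) * h[j] * (1.0 - h[j])
                for j in range(n_hidden)
            ]
            # Step d: gradient step plus momentum term for every weight.
            for i in range(n_out):
                for j in range(n_hidden + 1):
                    step = mu * delta_o[i] * hb[j] + alpha * prev_o[i][j]
                    w_o[i][j] += step
                    prev_o[i][j] = step
            for j in range(n_hidden):
                for k in range(n_in + 1):
                    step = mu * delta_h[j] * xb[k] + alpha * prev_h[j][k]
                    w_h[j][k] += step
                    prev_h[j][k] = step
            sq_err += sum((d[i] - y[i]) ** 2 for i in range(n_out))
        # Step e is replaced here by a fixed epoch budget; the MSE is recorded per epoch.
        history.append(sq_err / (len(patterns) * n_out))
    return w_h, w_o, history
```

The default values 0.9 and 0.2 mirror the learning rate and momentum constant reported for the MLPNN in section 4; all other defaults are placeholders.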


2.2. Radial basis function neural network (RBFNN) 

The radial basis function neural network (RBFNN) design consists of three layers: an input layer,
a single hidden layer of nonlinear processing neurons, and an output layer. The RBFNN output is
calculated according to (5):

$$y_i = f_i(x) = \sum_{k=1}^{N} w_{ik}\,\phi_k(x, c_k) = \sum_{k=1}^{N} w_{ik}\,\phi_k\left(\|x - c_k\|_2\right), \quad i = 1, 2, \ldots, m \qquad (5)$$

where $x \in \mathbb{R}^{n \times 1}$ is an input vector, $\phi_k(\cdot)$ is a function from $\mathbb{R}^{+}$ to
$\mathbb{R}$, $\|\cdot\|_2$ denotes the Euclidean norm, $w_{ik}$ are the weights in the output layer, N is
the number of neurons in the hidden layer, and $c_k \in \mathbb{R}^{n \times 1}$ are the RBF centers in the
input vector space. For each neuron in the hidden layer, the Euclidean distance between its
associated center and the network input is calculated. The output of the hidden layer neuron is a
nonlinear function of this distance. Finally, the network output is calculated as the weighted
sum of the hidden layer outputs [21]. The functional form of $\phi_k(\cdot)$ is assumed to be given
and is most commonly the Gaussian function (6),

$$\phi(x) = \exp\left(-x^2/\sigma^2\right) \qquad (6)$$

where the parameter $\sigma$ controls the “width” of the radial basis function and is commonly
referred to as the spread parameter. The centers are defined points that are assumed to provide
an adequate sampling of the input vector space. They are typically selected as a subset of the
input data [22], [23]. For the Gaussian radial basis function, the spread parameter $\sigma$ is
commonly set according to the following heuristic relationship:

$$\sigma = \frac{d_{max}}{\sqrt{K}} \qquad (7)$$

where $d_{max}$ is the maximum Euclidean distance between the selected centers and K is the number
of centers. Substituting (7) into (6) gives the RBF of a neuron in the hidden layer of the network:

$$\phi(x, c_k) = \exp\left(-\frac{K}{d_{max}^2}\,\|x - c_k\|^2\right) \qquad (8)$$
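
The forward computation of the RBFNN can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a Gaussian basis function with one spread value shared by all hidden neurons, set to the maximum distance between centers divided by the square root of the number of centers, and at least two distinct centers. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def spread_from_centers(centers):
    """Heuristic spread: maximum distance between centers over sqrt(K)."""
    d_max = max(euclidean(a, b) for a in centers for b in centers)
    return d_max / math.sqrt(len(centers))


def rbf_hidden(x, centers, sigma):
    """Gaussian hidden layer outputs exp(-||x - c_k||^2 / sigma^2)."""
    return [math.exp(-euclidean(x, c) ** 2 / sigma ** 2) for c in centers]


def rbf_output(x, centers, sigma, weights):
    """Network outputs: one weighted sum of hidden outputs per output node.

    weights[i][k] connects hidden neuron k to output node i.
    """
    phi = rbf_hidden(x, centers, sigma)
    return [sum(w * p for w, p in zip(row, phi)) for row in weights]
```

For the problem studied here, `x` would hold the eight normalized inputs and `weights` would have three rows, one per job category.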
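The error function is defined as

$$J(n) = \frac{1}{2}\,|e(n)|^2 = \frac{1}{2}\left[y_d(n) - \sum_{k=1}^{N} w_k(n)\,\phi\left(x(n), c_k(n)\right)\right]^2 \qquad (9)$$

When the chosen RBF is Gaussian, (9) becomes

$$J(n) = \frac{1}{2}\left[y_d(n) - \sum_{k=1}^{N} w_k(n)\exp\left(-\frac{\|x(n) - c_k(n)\|_2^2}{\sigma_k^2(n)}\right)\right]^2$$

The equation for updating the network weights is given by (10):

$$w(n+1) = w(n) - \mu_w \frac{\partial}{\partial w} J(n)\Big|_{w = w(n)} = w(n) + \mu_w\, e(n)\, \psi(n) \qquad (10)$$

where $e(n) = y_d(n) - y(n)$ is the error between the desired network output $y_d(n)$ and the actual
output $y(n)$, $\psi(n) = [\phi(x(n), c_1(n)), \ldots, \phi(x(n), c_N(n))]^T$ is the vector of hidden layer
outputs, and $\mu_w$ is an appropriate learning parameter.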
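
A single gradient step on the output weights of the RBFNN can be sketched as follows. The sketch assumes one output node, fixed centers, a common spread value, and the error convention e(n) = y_d(n) - y(n), under which the gradient step adds the learning parameter times the error times the hidden layer output to each weight. The names `gaussian` and `lms_step` are hypothetical.

```python
import math


def gaussian(x, c, sigma):
    """Gaussian basis function exp(-||x - c||^2 / sigma^2)."""
    dist_sq = sum((p - q) ** 2 for p, q in zip(x, c))
    return math.exp(-dist_sq / sigma ** 2)


def lms_step(w, x, y_d, centers, sigma, mu_w=0.1):
    """One gradient step on the output weights for a single training pattern.

    Returns the updated weights and the error value J = 0.5 * e^2 before the update.
    """
    psi = [gaussian(x, c, sigma) for c in centers]   # hidden layer outputs
    y = sum(wk * pk for wk, pk in zip(w, psi))       # actual network output
    e = y_d - y                                      # desired minus actual
    new_w = [wk + mu_w * e * pk for wk, pk in zip(w, psi)]
    return new_w, 0.5 * e ** 2
```

Applying `lms_step` repeatedly to the same pattern lowers its error value, which is the behaviour expected of a gradient descent step with a sufficiently small learning parameter.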

For both MLPNN and RBFNN, several sets of different kinds of patterns were first created from
the modeling equations with predefined parameter values. Pre-processing was carried out to scale
the pattern values into the range [0, 1]. A variety of pattern sets was designed, each containing
a different number of patterns. An appropriate architecture was used to learn the pattern
classification, with the target set to the maximum value of one for the matching pattern class
and to the minimum value of zero for the other pattern classes [24], [25]. The mean square error
was used as the error function to assess the quality of learning, and the gradient method was
used to update the weights.
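
The pre-processing and target encoding can be illustrated as follows. The sketch assumes column-wise min-max scaling as the way of bringing values into [0, 1]; the text does not state which scaling formula was used, so this choice and the function names are assumptions.

```python
def min_max_scale(rows):
    """Scale every column of a list of numeric rows into the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        # A constant column carries no information and is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def one_hot(label, classes):
    """Target vector: 1.0 for the matching class and 0.0 for all other classes."""
    return [1.0 if label == c else 0.0 for c in classes]
```

With the three job categories as classes, `one_hot("Type II", ["Type I", "Type II", "Type III"])` gives the target vector for a three-node output layer.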


2.3. K-means 

K-means is an unsupervised machine learning technique and a popular algorithm for clustering
data points. It first selects K random points, called centroids, as the centers of the clusters,
where the value of K is predetermined. It then computes the distance of each data point to the
centroids and assigns the point to the nearest one. The centroids are recalculated, the distances
between the data points and the updated centroids are recomputed, and the clusters are
readjusted; this is repeated until no data point changes its cluster. The algorithm:


1. Initialize cluster centroids $\mu_1, \mu_2, \mu_3, \ldots, \mu_k \in \mathbb{R}^n$ arbitrarily
2. Repeat until convergence: {
       For each i, set
           $c^{(i)} := \arg\min_j \|x^{(i)} - \mu_j\|^2$
       For every j, set
           $\mu_j := \frac{\sum_i 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_i 1\{c^{(i)} = j\}}$
   }
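
The two alternating steps of the algorithm can be written compactly in Python. The sketch below makes a few assumptions that the text does not fix: initial centroids are drawn at random from the data points, convergence is declared when no assignment changes, and an empty cluster keeps its previous centroid. The function names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (centroids, assignment)."""
    rng = random.Random(seed)
    # Step 1: choose k data points at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed its cluster
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, c in zip(points, assignment) if c == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment
```

In the setting of this paper, k would be 3 and each point would be the raw eight-element record of a student.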
3. RESEARCH METHOD

The function block diagram of the proposed work, which consists of three different algorithms,
is shown in Figure 1. First, three different data sets are formed from the complete data set;
normalized versions (TDS1 and TDS2) are applied to the MLPNN and RBFNN architectures
respectively, while raw data are applied to the K-means algorithm. The hidden layer and output
layer weights (wh and wo) of the MLPNN are updated through the gradient descent algorithm by
minimizing the mean square error. The output layer weights (wo) of the RBFNN are updated in a
separate learning process, which is also based on the gradient descent method. Once the
termination criterion is satisfied, the weight values of both algorithms are stored. The K-means
algorithm is applied to obtain three different centroids, one for each category of job. Once the
learning process is over for all three algorithms, the stored weights and centroids are used in
the test phase to predict the outcome of each individual process. The outcomes of the three
processes are then combined to form the final prediction for each test case, as shown in
Figure 2.


[Block diagram; recovered labels: Student performance data set, Pre-processing (Normalization), Training data set, Targets (TDS1), Targets (TDS2)]

Figure 1. Function block diagram of hybrid model in learning phase 


Figure 2. Function block diagram of hybrid model in test phase 
4. RESULTS AND DISCUSSION

The data set comprised a total of 300 records of graduate students of an engineering college who
had completed their 1st to 7th semester courses; their marks were used to predict their chances
of employment in different categories of jobs. One more parameter, called the personality point
(PP), was included; it encompasses several qualities of an individual’s personality, such as
presentation skill, reaction skill, teamwork skill and other extracurricular activities. The PP
score lies in the range of 1 to 10. The PP features are not key parameters in making someone
suitable for the core jobs of their engineering discipline, but a high value can help a student
become a better employee, so the chance of getting a better job increases. Three categories of
jobs were considered: Type I (very good category job: >8 L/annum), Type II (good category job:
5 to 8 L/annum) and Type III (average category: <5 L/annum). The marks were taken from the
students’ records, the PP values were assigned by the recruiters, and the job outcomes were
taken from the recruitment history, over a 2-year span, of the different companies that visited
the campus. A sample of the data set is shown in Table 1. Through a uniform random process, the
data of 150 students were taken for training, while the remaining 150 students’ data were used
as test cases. With this process, three different data sets were created for training and
testing and used by the three algorithms, as shown in Figure 3, to predict the chances of each
category of employment.
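
The random division of the 300 records into 150 training and 150 test records can be sketched as below. This is an assumed procedure, a uniform shuffle of the record indices; the helper `split_records` and its `seed` argument are hypothetical.

```python
import random


def split_records(records, n_train=150, seed=None):
    """Uniformly at random pick n_train records for training; the rest form the test set."""
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    train = [records[i] for i in indices[:n_train]]
    test = [records[i] for i in indices[n_train:]]
    return train, test
```

Calling the function three times with different seeds gives three training sets of the same size that overlap in part, each paired with its own complementary test set.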


Table 1. Sample data from data set

                        Marks
Stu   Sm1    Sm2    Sm3    Sm4    Sm5    Sm6    Sm7    PP     Obtained job
1     76.1   74.3   65.2   69.1   80.2   80.1   78.4   1      Type II
2     80.3   76.2   78.1   79.4   82.2   84.1   80.3   6      Type I
3     68.3   79.8   76.1   72.6   73.2   70.0   73.0   2.0    Type II
4     73.1   76.7   7715   78.6   79.1   75.7   75.6   6.0    Type II
5     67.8   64.7   69.5   65.4   64.5   67.8   61.3   6.0    Type III
6     63.7   61.7   68.8   69.2   67.3   62.9   61.5   4.0    Type III
7     69.9   69.9   69.9   68.6   69.8   66.4   69.6   9.0    Type II


[Diagram; recovered labels: MLP Neural Network, K-Means Cluster]


Figure 3. Abstract form of proposed solution 


From the data set of 300 student records, three training data sets of the same size (150) were
formed through a random sampling process. Each test data set was formed from all the records
left over after the corresponding training data set had been defined. For the MLPNN and RBFNN,
normalized data were applied to avoid operation in the saturation region. For K-means
clustering, the raw data (without normalization) were applied to increase the inter-cluster
distance. Because of the random sampling process, it is very likely that the three data sets
share some records while also differing in their contents. For the MLPNN and RBFNN, the optimal
number of hidden layer nodes was decided by evaluating the performance with 2, 5, 8 and 10 hidden
nodes. In both cases the input layer contains 8 nodes and the output layer has 3 nodes, with each
output node representing one job class. The learning rate and momentum constant of the gradient
descent process for the MLPNN were set to 0.9 and 0.2, while for the RBFNN the learning parameter
was set to 0.1. The resulting performance is shown in Figure 4. The whole process was implemented
in the MATLAB environment.

It is clear from Figures 4(a) to (d) that the MLPNN performed better when the number of hidden
nodes was less than 10. The MLPNN performance was also more consistent than that of the RBFNN.
With 10 nodes in the hidden layer, the two networks performed the same. Hence the number of
hidden nodes was set to 5 for the MLPNN and to 10 for the RBFNN. It is also interesting to note
that the MLPNN reached a lower mean square error (MSE) than the RBFNN. The convergence
characteristics for all the hidden node settings are shown in Figure 4. It can be observed that
the RBFNN converged faster than the MLPNN, but the quality of convergence was better for the
MLPNN.

For the three different training data sets, the performance of the MLPNN, RBFNN and K-means
algorithms was obtained over 7 independent trials to assess the consistency of the delivered
outcomes. The performance on the training and test data, along with the overall performance on
the whole data set, was estimated. To quantify the benefit of the proposed hybrid model, its
outcomes over the whole data set were also estimated. The results are shown in Table 2. Each
method is summarized statistically through its mean and standard deviation. It can be observed
that the hybrid model delivered the highest percentage of correct results (mean 88.38%) along
with the smallest variation across trials (standard deviation 0.80).


[Plot panels (a) to (d): MLPNN and RBFNN curves against Iteration No., 0 to 500]

Figure 4. Convergence characteristics of MLPNN and RBFNN with (a) 2 hidden nodes, (b) 5 hidden nodes, (c) 8 hidden nodes, and (d) 10 hidden nodes


Table 2. Performance (%) of different algorithms under 7 independent trials

Trials       MLPNN                    RBFNN                    K-means cluster          Hybrid model
             Tr.    Test   Overall    Tr.    Test   Overall    Tr.    Test   Overall    overall
1            86.67  90.67  88.67      86.67  90.67  88.67      86.67  90.67  88.67      88.67
2            86.67  90.67  88.67      77.33  76.00  76.67      86.67  90.67  88.67      88.67
3            88.67  89.33  89.00      84.00  81.33  82.67      91.33  86.67  89.00      89.00
4            88.00  88.67  88.33      85.33  88.67  87.00      88.67  88.00  88.33      88.33
5            88.67  80.00  84.33      86.67  74.67  80.67      87.33  88.67  88.00      86.67
6            88.00  84.00  86.67      85.33  92.00  88.67      47.33  50.00  48.67      88.33
7            80.67  76.00  78.33      88.00  88.00  88.00      91.33  87.33  89.33      89.00
Mean         86.76  85.62  86.28      84.76  84.47  84.62      82.76  83.14  82.95      88.38
(Std. Dev.)  2.60   5.35   3.58       3.51   7.10   4.69       15.75  14.69  15.12      (0.8027)
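
The summary statistics of the hybrid model column can be checked with a few lines of Python. The seven values are copied from the last column of Table 2; the sample form of the standard deviation (divisor n - 1) is the form that reproduces the reported 0.8027 for this column.

```python
import statistics

# Overall accuracy (%) of the hybrid model in the 7 trials, copied from Table 2.
hybrid_overall = [88.67, 88.67, 89.00, 88.33, 86.67, 88.33, 89.00]

mean_acc = statistics.mean(hybrid_overall)
std_acc = statistics.stdev(hybrid_overall)  # sample standard deviation (n - 1)

print(f"mean = {mean_acc:.2f}, std = {std_acc:.4f}")  # mean = 88.38, std = 0.8027
```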
The performance for various individual student test cases is shown in Table 3. The strength of
the proposed method is that it does not only predict a single job category but also reports the
chance of the other job categories where applicable. For example, for student 1 all three
algorithms predicted Type II, so the chance of getting a Type II job is nearly 100%, while a
Type III job also remains available as an option. For student 3, there is a 33.33% chance of a
Type I job and a 66.67% chance of a Type II job, because one algorithm favored Type I while the
other two algorithms favored Type II. This kind of prediction helps recruiters make the
recruitment process more efficient.
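
The way the three individual predictions turn into relative chances can be sketched as a simple vote count. The sketch assumes that each algorithm contributes one equally weighted vote; the function name `job_chances` is hypothetical.

```python
from collections import Counter


def job_chances(predictions, categories=("Type I", "Type II", "Type III")):
    """Percentage of algorithms that voted for each job category."""
    votes = Counter(predictions)
    return {c: 100.0 * votes[c] / len(predictions) for c in categories}


# One algorithm favors Type I, the other two favor Type II.
chances = job_chances(["Type I", "Type II", "Type II"])
```

With three voters the only possible non-zero chances are 33.33%, 66.67% and 100%.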


Table 3. Prediction for individuals carrying the details about possibilities of employment

      Student performance                                       Chance of job type
Stu   Sm1    Sm2    Sm3    Sm4    Sm5    Sm6    Sm7    PP     Type I       Type II      Type III
1     76     74     65     69     80     80     78     1      ×            ✓(100%)      ✓
2     80     76     78     79     82     84     80     6      ✓(100%)      ✓            ✓
3     68.3   79.8   76.1   72.6   73.2   70.0   70     2.0    ✓(33.33%)    ✓(66.67%)    ✓
4     73.1   76.7   715    78.6   79.1   75.7   75.6   6.0    ✓(33.33%)    ✓(66.67%)    ✓
5     67.8   64.7   69.5   65.4   64.5   67.8   61.3   6.0    ×            ✓(33.33%)    ✓(66.67%)
6     63.7   61.7   68.8   69.2   67.3   62.9   61.5   4.0    ×            ✓(33.33%)    ✓(66.67%)
7     69.9   69.9   69.9   68.6   69.8   66.4   69.6   9.0    ×            ✓(66.67%)    ✓(33.33%)


5. CONCLUSION 

Predicting the employment of graduate students of an engineering college is a very challenging
task, especially when only a limited number of parameters is included. The challenge increases
further when high consistency and efficiency are required. The proposed job prediction solution
uses three different models in combined form to deliver the final prediction. The input-output
mapping is achieved with the MLPNN and the RBFNN. It was observed that the MLPNN needed fewer
hidden nodes than the RBFNN for the same performance, and that its quality of convergence was
also better. The hybrid model delivered a prediction accuracy of 88.38%, while the variation in
performance across trials (standard deviation) was limited to about 0.8%, which is very
appealing. The proposed model is simple in design and can be extended to other academic streams.